Creating software for environmental measurements based on GPS data - twsagps package report number 2
1 Introduction
This report comprehends the present state of twsagps package - current issues and further development plans. Package consists of five functions: exposure_PO(), exposure_KDE(), exposure_DR(), exposure_LS() and rast_stats(). The first four functions each utilize one of Time Weighted Spatial Averaging (TWSA) method. Fifth function calculates various statistics for multiple rasters. The TWSA methods are taken from the reference paper (Jankowska et al. 2023) on which the package is based upon - Point Overlay (PO), Kernel Density Estimation (KDE), Density Ranking (DR) with the addition of Line Segment (LS). Report demonstrates code workflow and outcomes.
For this demo following packages were used:
library(terra)
library(exactextractr)
library(sf)
library(tmap)
library(tmaptools)
library(dplyr)
library(twsagps)2 Research area and sample data
For presenting package functioning Geolife GPS Trajectories 1.3 Data (Zheng et al. 2009), (Zheng et al. 2008), (Yu Zheng 2010) was used as sample GPS data. The Geolife dataset consists of significant volumes of time-stamped GPS data retrieved from 182 pariticipants in over five year time span. Research area was limited to San Diego county and GPS data was restricted to the research area. To improve running time of functions Geolife data was reduced to each 10th record. The GPS data was collected between 17th August 2011 and 20th August 2011.
geol_file = read.csv("geolife_san_diego.csv")
geolife_days = geol_file |> select(-X) |>
slice(which(row_number() %% 10 == 1))
geolife_time = geolife_days |>
mutate(dateTime = paste(date, time),
dateTime = as.POSIXct(dateTime, format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))| lat | lon | date | time | dateTime |
|---|---|---|---|---|
| 33.26486 | -117.4323 | 2011-08-17 | 21:38:11 | 2011-08-17 21:38:11 |
| 33.26239 | -117.4301 | 2011-08-17 | 21:38:21 | 2011-08-17 21:38:21 |
| 33.25969 | -117.4280 | 2011-08-17 | 21:38:31 | 2011-08-17 21:38:31 |
| 33.25700 | -117.4259 | 2011-08-17 | 21:38:41 | 2011-08-17 21:38:41 |
| 33.25437 | -117.4238 | 2011-08-17 | 21:38:51 | 2011-08-17 21:38:51 |
| 33.25162 | -117.4216 | 2011-08-17 | 21:39:01 | 2011-08-17 21:39:01 |
NDVI raster data was applied as sample environmental layer. Indicator was calculated from a Landsat 8 image involving San Diego county which was captured on 22nd Februrary 2024.
landsat_red = rast("landsat_ndvi/LC09_L2SP_040037_20240222_20240224_02_T1_SR_B4.TIF")
landsat_nir = rast("landsat_ndvi/LC09_L2SP_040037_20240222_20240224_02_T1_SR_B5.TIF")
landsat_ndvi = (landsat_nir - landsat_red) / (landsat_nir + landsat_red)
Vector environmental layer of bike routes lines was utilized as consecutive sample data. Cycling paths span over San Diego county and were sourced from City of San Diego Open Data portal updated at 2024-04-30. Bicycle trail data is reclassified to numeric type based on class attribute indicating category of bike lane.
Class I - Paved right-of-way for exclusive use by bicyclists, pedestrians and those using non-motorized modes of travel. They are physically separated from vehicular traffic and can be constructed in roadway right-of-way or exclusive right-of-way (Reclassified as 3)
Class II - One-way facilities on either side of a roadway with pavement striping and signage used to allocate a portion of a roadway for exclusive or preferential bicycle travel (Reclassified as 2)
Class III - Shared use with motor vehicle traffic within the same travel lane. Designated by signs (Reclassified as 1)
Class IV (One-Way)/Class IV (Two-Way) - Exclusive bike facility physically separated from motor traffic and distinct from the sidewalk (Reclassified as 4)
sd_bike = vect("san_diego_bike/bike_routes_datasd.shp")
sd_bike$class = replace(sd_bike$class, sd_bike$class == "Class IV (Two-Way)", "Class IV (One-Way)")
sd_bike$class = factor(sd_bike$class, levels = c("Class III", "Class II", "Class I", "Class IV (One-Way)")) |> as.numeric()All sample data is avaible at repository.
3 Methods
3.1 Point Overlay (PO)
Point Overlay method is implemented in exposure_PO() function. Evaluated by converting GPS point SpatVector data to SpatRaster data type using sum function.
3.2 Kernel Density Estimation (KDE)
The exposure_KDE() utilizes the KDE method. Method generates smoothed buffers aroud GPS data considering differences in spatial distribution and influence. Function calculates exposure from GPS points coordinates on matrix of coordinates using spat_kde() created for this package. The spat_kde() function, used in computing KDE, utilizes quartic kernel (Silverman 1986):
\hat{f}(x) = \frac{1}{h^2} \sum^n_{i=1} K \{\frac{1}{h}(x - X_i)\}
and
K_2(x) = \left\{ \begin{array}{rcl} 3\pi^{-1}(1-x^Tx)^2 & if & x^Tx > 1 \\ 0 & \ & otherwise \end{array}\right.
where h is kernel search radius (bandwidth), x current cell coordinates and X_i current data point. KDE formula does not incorporate further division by n number of cells for the purpose of calculating KDE in intensity unit (https://gis.stackexchange.com/a/389744).
3.3 Density Ranking (DR)
Density Ranking method in exposure_DR() is based on DR() function from yenchic/density_ranking GitHub repository [Chen and Dobra (2020)](Chen 2019) by implementing Empirical Cumulative Distrubution Function in spat_dr() function created fot this package. DR serves as a improvement to KDE method by ranking KDE values (Jankowska et al. 2023). Function implements DR formula (Chen and Dobra 2020):
\hat{\alpha}(x) = \sum^n_{i=1}I(\hat{p}(X_i)\leq\hat{p}(x))
The \hat{\alpha}(x) takes value between 0 and 1. Lack of n number of cells division in formula mentioned in Section 3.2.
3.4 Line Segment (LS)
New addition in TWSA approaches is a Line Segment method applied in exposure_LS() drawn from michaeldgarber/microclim-static-v-dynam GitHub repository. Code applied in exposure_LS() consists of modified functions from the repository’s demo. Line Segment puts emphasis on time component of GPS data. Firstly, trajectories following GPS movement. Than buffers around each trajectory segment are created and combined with time elapsed between every two GPS records. Secondly, weighted values for each segment are extracted from environmental layer. Obtained polygons are converted to raster with sum function afterwards.
Demo applies terra::extract() in deriving environmental data values which is substituted in exposure_LS() with exactextractr::exact_extract() function due to significant computation speed difference.
4 Processing
4.1 Parameters
TWSA functions of twsagps package utilizes various arguments:
data - data.frame, SpatVector points or sf data.frame containing only POINTS of GPS data
x - x or longitude coordinates column name if data is a data.frame
y - y or latitude coordinates column name if data is a data.frame
NA_val - value in x and y marked as NA if data is a data.frame
time_data (LS) - column of data containing POSIXct class objects
time_unit (LS) - unit of time weights in time_data
cellsize - size of raster cells in meters if output has a longtitude/latitude CRS or in the units of output CRS
group_split - column of data based on which it is grouped and split. For n groups output will be n SpatRasters
bandwidth (KDE/DR/LS) - bandwidth in units of output CRS (DR/KDE) or size of buffer in meters if output has a longtitude/latitude CRS or in the units of output CRS (LS)
env_data - stars, SpatRaster, SpatVector or sf object class of environmental data. Activity space is calculated when not set
env_field - column name of SpatVector/sf env_data that env_data will be rasterized on
env_buff - optional buffer around SpatVector/sf env_data in meters if output has a longtitude/latitude CRS or in the units of the CRS
normalize - if TRUE activity space SpatRaster is normalized
norm_method - normalization method - “center”, “scale”, “standardize” or “range”. Default is “range”
norm_group - if FALSE each activity space SpatRaster is normalized seperately. If TRUE activity space SpatRasters are normalized to SpatRaster with highest max value. Not ignored when normalize is TRUE, norm_method is “range” and group_split is set
time_only_norm (LS) - TEMPORARY - type of time normalization. If TRUE only time_data is normalized. If FALSE and norm_method is “range” normalization is conducted with regard to spatial distribution
grid_extent - if stars or SpatRaster grid_extent is output’s grid. If SpatExtent, sf bbox object or numeric vector c(xmin, xmax, ymin, ymax) grid_extent is output’s extent
input_crs- CRS of data’s coordinates if data is a data.frame
output_crs- CRS of output
filepath - output filename
data = geolife_time
x = "lon"
y = "lat"
NA_val = NULL
time_data = "dateTime"
time_unit = "mins"
cellsize = 50
group_split = "date"
bandwidth = 200
# ndvi env_data
env_data = landsat_ndvi
env_field = NULL
env_buff = NULL
# bike env_data
env_data = sd_bike
env_field = "class"
env_buff = 50
normalize = TRUE
norm_method = "range"
norm_group = TRUE
time_only_norm = FALSE
grid_extent = NULL
input_crs = "EPSG:4326"
output_crs = crs(landsat_ndvi)
filepath = NULL4.2 Entry error handling
With the aim of enhancing versatility of TWSA functions multiple types of objects are managed as arguments. Unhandled and handled classes are filtered and processed to uniform type returning proper messages, warnings and errors.
Input of environmental data must be stars, SpatRaster, SpatVector or sf object class. Stars is converted into SpatRaster and sf into SpatVector.
# handle stars objects rasters as env_data
if (!is.null(env_data)) {
if (any(class(env_data) == "stars")){
env_data = terra::rast(env_data)
} else if (any(class(env_data) == "sf")) {
env_data = terra::vect(env_data)
} else if (any(!class(env_data) %in% c("SpatVector", "SpatRaster"))){
stop("Invalid env_data - env_data neither stars, sf, SpatVector nor SpatRaster class")
}
}Grid extent argument handles stars, SpatRaster, SpatExtent, sf bbox class objects and numeric vector of 4 values indicating boundaries of extent (xmin, xmax, ymin, ymax). For grid extent stars class the object is transformed to SpatRaster and sf bbox or vector to SpatExtent.
# handle grid_extent
if (!is.null(grid_extent)){
if (any(class(grid_extent) == "stars")){
grid_extent = terra::rast(grid_extent)
} else if (any(class(grid_extent) == "bbox" || (is.vector(grid_extent) &&
all(class(grid_extent) == "numeric") && length(grid_extent) == 4))){
grid_extent = terra::ext(grid_extent)
} else if (any(!class(grid_extent) %in% c("SpatExtent", "SpatRaster"))){
stop("Invalid grid_extent - grid_extent neither stars, SpatRaster, SpatExtent, bbox class nor numeric vector of 4 length")
}
}Normalize and groups normalization arguments are boolean. They undergo tests to ascertain interpretability as logical, subsequently converted to logical.
# handle normalize and norm_group
if (is.na(as.logical(normalize))){
stop("Invalid normalize argument - normalize argument cannot be interpreted as boolean")
} else {
normalize = as.logical(normalize)
}
if (is.na(as.logical(norm_group))){
stop("Invalid norm_group argument - norm_group argument cannot be interpreted as boolean")
} else {
norm_group = as.logical(norm_group)
}Normalization method is required to be one of normalization methods from BBmisc::normalize(). When normalization is conducted and method is not center, scale, standardize or range, argument is assigned to default method - range.
# handle norm_method
if (!norm_method %in% c("center", "scale", "standardize", "range") && normalize == TRUE) {
warning('Invalid norm_method - applying default normalization method "range"')
norm_method = "range"
}4.2.1 Kernel Density Estimation/Density Ranking/Line Segment
In exposure_KDE(), exposure_DR() and exposure_LS() obligatory bandwidth argument is implemented. Bandwidth is constrained to be single positive numeric value.
#handle bandwidth
if (is.null(bandwidth)){
stop("Missing bandwidth argument. Provide valid bandwidth")
} else if (length(bandwidth) != 1 || !is.numeric(bandwidth) || bandwidth <= 0){
stop("Invalid bandwidth argument - bandwidth is neither positive nor single numeric value")
}4.2.2 Line Segment
Line Segment temporary time only norm argument (see Section 6.2) is expected to be logical and is handled analogously to normalize and groups normalization arguments.
# handle time_only_norm
if (is.na(as.logical(time_only_norm))){
stop("Invalid time_only_norm argument - time_only_norm argument cannot be interpreted as boolean")
} else {
time_only_norm = as.logical(time_only_norm)
}4.3 Entry processing
Considering numerous GPS data types, grid extent, environmental layer and CRS arguments GPS data is required to be preprocessed prior to further computation. Initially data frame GPS dataset class is handled. Obligatory for developing spatial data x and y coordinates columns are validated. Afterwards star_processing() function is called.
# get spatial data with correct crs
if (!is.null(data)){
if (all(class(data) == "data.frame")){
if (all(!c(is.null(x), is.null(y)))){
x_enq = rlang::enquo(x)
y_enq = rlang::enquo(y)
if (all(c(rlang::quo_name(y_enq), rlang::quo_name(y_enq)) %in% colnames(data))){
data_proj = start_processing(data = data, x = rlang::quo_name(x_enq),
y = rlang::quo_name(y_enq), NA_val = NA_val,
env_data = env_data, grid_extent = grid_extent,
input_crs = input_crs, output_crs = output_crs)
} else {
stop("Invalid x or y arguments - x or y are not a column in data")
}
} else {
stop('Missing x or y arguments for data "data.frame" class')
}
} else {
data_proj = start_processing(data = data, env_data = env_data,
grid_extent = grid_extent, input_crs = input_crs,
output_crs = output_crs)
}
} else {
stop("Missing data argument - provide valid data argument")
}4.3.1 Start_processing function
The start_processing() evaluates target SpatVector points GPS data class incorporating exclusion of NA values and CRS transformations with properly signalled conditions.
Code
start_processing = function(data, x, y, NA_val, env_data, grid_extent,
input_crs, output_crs){
# get spatial data
if (all(class(data) == "data.frame")) {
if (is.null(input_crs)){ # implement so missing works
input_crs = ""
}
if (!is.null(NA_val)) {
if (is.numeric(NA_val)){
n_row = nrow(data)
data = data[data[x] != NA_val & data[y] != NA_val,]
message(paste0("Removing rows containing default NA value ",
NA_val, " in x and y columns - removing ", n_row - nrow(data), " rows"))
} else {
warning("Invalid NA_val argument - NA_val is not numeric. Ignoring NA_val argument")
}
}
if (any(is.na(data[,c(x,y)]))){
n_row_NA = nrow(data)
data = tidyr::drop_na(data, x, y)
message(paste0("Removing rows containing NA in x and y columns - removing", n_row_NA - nrow(data), " rows"))
}
data_points = terra::vect(x = data, geom = c(x, y), crs = input_crs)
} else if (any(class(data) == "sf") && all(sf::st_geometry_type(data) == "POINT")){
data_points = terra::vect(data)
if (!is.null(input_crs)){
message("Ignoring input_crs argument")
}
} else if (any(class(data) == "SpatVector") && terra::geomtype(data) == "points") {
data_points = data
if (!is.null(input_crs)){
message("Ignoring input_crs argument")
}
} else {
stop("Object data is neither data.frame, sf nor SpatVector class")
}
data_crs = terra::crs(data_points)
# crs
if (!is.null(output_crs)){
if (data_crs == "") {
message("Setting data crs to output_crs")
terra::crs(data_points) = output_crs
data_proj = data_points
} else {
message("Projecting data to output_crs")
data_proj = terra::project(data_points, output_crs)
}
} else if (!is.null(grid_extent) && class(grid_extent) == "SpatRaster") {
if (data_crs == "") {
message("Setting data crs to grid_extent crs")
terra::crs(data_points) = terra::crs(grid_extent)
data_proj = data_points
} else {
message("Projecting data to grid_extent crs")
data_proj = terra::project(data_points, grid_extent)
}
} else if (data_crs != "") { # any invalid/empty crs
data_proj = data_points
} else if (!is.null(env_data)) {
message("Setting data crs to environmental data crs")
terra::crs(data_points) = terra::crs(env_data)
data_proj = data_points
} else {
message("No crs specified")
data_proj = data_points
}
# crop to grid_extent
if (!is.null(grid_extent)){
if (class(grid_extent) == "SpatExtent"){
data_proj = terra::crop(data_proj, grid_extent)
} else {
if (terra::crs(data_proj) != terra::crs(grid_extent)){
grid_rast = terra::project(grid_rast, terra::crs(data_proj))
}
data_proj = terra::crop(data_proj, terra::ext(grid_extent))
}
}
return(data_proj)
}At the outset function handles various GPS dataset classes. For data frame class object input CRS is tested. Thereafter, rows containing value from NA_val argument or NA value in x and y columns are omitted. Utilizing processed coordinates columns SpatVector object with input CRS is generated. In cases of sf data frame containing only POINTS GPS data is converted to SpatVector. Points SpatVector and sf class ignore input CRS.
# get spatial data
if (all(class(data) == "data.frame")) {
if (is.null(input_crs)){
input_crs = "" # create vector with empty crs
}
if (!is.null(NA_val)) {
if (is.numeric(NA_val)){
n_row = nrow(data)
data = data[data[x] != NA_val & data[y] != NA_val,]
message(paste0("Removing rows containing default NA value ",
NA_val, " in x and y columns - removing ", n_row - nrow(data), " rows"))
} else {
warning("Invalid NA_val argument - NA_val is not numeric. Ignoring NA_val argument")
}
}
if (any(is.na(data[,c(x,y)]))){
n_row_NA = nrow(data)
data = tidyr::drop_na(data, x, y)
message(paste0("Removing rows containing NA in x and y columns - removing", n_row_NA - nrow(data), " rows"))
}
data_points = terra::vect(x = data, geom = c(x, y), crs = input_crs)
} else if (any(class(data) == "sf") && all(sf::st_geometry_type(data) == "POINT")){
data_points = terra::vect(data)
if (!is.null(input_crs)){
message("Ignoring input_crs argument")
}
} else if (any(class(data) == "SpatVector") && terra::geomtype(data) == "points") {
data_points = data
if (!is.null(input_crs)){
message("Ignoring input_crs argument")
}
} else {
stop("Object data is neither data.frame, sf nor SpatVector class")
}Subsequently, start_process() function manages setting and transformation of GPS CRS. Operation has regard to following hierarchy - output CRS, SpatRaster grid extent, not empty currect SpatVector CRS and environmental dataset. Whilst none of conditions mentioned beforehand are fulfilled, the SpatVector’s CRS remains empty.
data_crs = terra::crs(data_points)
# crs
if (!is.null(output_crs)){
if (data_crs == "") {
message("Setting data crs to output_crs")
terra::crs(data_points) = output_crs
data_proj = data_points
} else {
message("Projecting data to output_crs")
data_proj = terra::project(data_points, output_crs)
}
} else if (!is.null(grid_extent) && class(grid_extent) == "SpatRaster") {
if (data_crs == "") {
message("Setting data crs to grid_extent crs")
terra::crs(data_points) = terra::crs(grid_extent)
data_proj = data_points
} else {
message("Projecting data to grid_extent crs")
data_proj = terra::project(data_points, grid_extent)
}
} else if (data_crs != "") { # any invalid/empty crs
data_proj = data_points
} else if (!is.null(env_data)) {
message("Setting data crs to environmental data crs")
terra::crs(data_points) = terra::crs(env_data)
data_proj = data_points
} else {
message("No crs specified")
data_proj = data_points
}Projecting data to output_crs
Finally, SpatVector is cropped to grid extent and returns processed GPS points SpatVector.
# crop to grid_extent
if (!is.null(grid_extent)){
if (class(grid_extent) == "SpatExtent"){
data_proj = terra::crop(data_proj, grid_extent)
} else {
if (terra::crs(data_proj) != terra::crs(grid_extent)){
grid_rast = terra::project(grid_rast, terra::crs(data_proj))
}
data_proj = terra::crop(data_proj, terra::ext(grid_extent))
}
}4.3.2 Line Segment
After processing CRS and SpatVector data in start_processing() Line Segment performs additional validation for time data and omits records with NA value.
# if time_data remove points with NA time_data
if (!is.na(time_data)){
enq_time = rlang::enquo(time_data)
time_cname = rlang::quo_name(enq_time)
if (!time_cname %in% terra::names(data_proj)){
stop("Invalid time_data argument - time_data is not a column name in data")
} else if (any(is.na(data_proj[time_cname]))){
n_row = nrow(data_proj)
data_proj = tidyterra::drop_na(data_proj, time_cname)
message(paste0("Removing points with NA ", time_cname, " values - removing ", n_row - nrow(data_proj), " rows"))
}
}4.4 Grid Raster
Upon managing CRS and numerous class objects computing of grid raster is conducted, later utilized in output raster calculation. The grid_calc() function incorporates cellsize, bandwidth, grid extent and environmental raster layer arguments in evaluation of grid raster. In scenarios where the grid extent is a SpatRaster, it is regarded as grid raster, with the calculation of the latter omitted.
if (!is.null(grid_extent) && any(class(grid_extent) == "SpatRaster")){
if (terra::crs(data_proj) != terra::crs(grid_extent)){
grid_extent = terra::project(grid_extent, terra::crs(data_proj))
}
grid_rast = grid_extent
} else {
grid_rast = calc_grid(x = data_proj, cellsize = cellsize, env_data = env_data,
grid_extent = grid_extent, bandwidth = bandwidth, is_LS = TRUE)
}4.4.1 Calc_grid function
Code
calc_grid = function(x, bandwidth, cellsize, env_data, grid_extent, is_LS = FALSE){
if (!is.null(grid_extent)){ # grid_extent as ext of grid rast
extent = grid_extent
} else { # calculate extent
extent = terra::ext(x)
if (!is.null(bandwidth)) { # for KDE/DR expand extent by bandwidth
if(is_LS == FALSE){
extent = c(terra::xmin(extent) - bandwidth, terra::xmax(extent) + bandwidth,
terra::ymin(extent) - bandwidth, terra::ymax(extent) + bandwidth)
} else { # for LS in lon/lat crs bandwidth is still in meters (terra::buffer)
temp_buff = terra::buffer(x, bandwidth)
extent = terra::ext(temp_buff)
}
}
}
if(!is.null(cellsize) && length(cellsize) == 1 && is.numeric(cellsize) && cellsize > 0) { # cellsize included
if (suppressWarnings(!is.na(terra::is.lonlat(x)) && terra::is.lonlat(x))){
# crs units in degrees
# is.na if empty crs to skip error
warning("Cellsize is not stable - cells are not rectangular")
if (!is.null(bandwidth) && bandwidth > 0.1 && is_LS == FALSE) {
message(paste0("CRS is in lontitude/latitude and bandwidth is ", bandwidth,
" - bandwidth is calculated in CRS units. Is bandwidth in correct unit?"))
}
dist_lon = geosphere::distm(c(extent[1], extent[3]), c(extent[2], extent[3]),
fun = geosphere::distHaversine)
dist_lat = geosphere::distm(c(extent[1], extent[3]), c(extent[1], extent[4]),
fun = geosphere::distHaversine)
# number of cells
x_cells = (dist_lon / cellsize) |> as.integer()
y_cells = (dist_lat / cellsize) |> as.integer()
# empty_rast for units in degrees
grid_rast = terra::rast(crs = terra::crs(x), nrows=y_cells,
ncols=x_cells, extent = extent)
} else {
grid_rast = terra::rast(crs = terra::crs(x), extent = extent,
resolution = cellsize)
}
} else if (!is.null(env_data) && any(class(env_data) == "SpatRaster")){ #if incorrect cellsize and env_data exists
warning("Cellsize invalid or not set - using cellsize from env_data")
if (terra::linearUnits(env_data) == terra::linearUnits(x)){
# project env data with the same cellsize
env_data_proj = terra::project(env_data, terra::crs(x), res = terra::res(env_data)[1])
# if env_data in lat/lon only quadratic cells
} else {
env_data_proj = terra::project(env_data, terra::crs(x))
}
if (is.null(grid_extent)){
# extent modified to preserve cellsize
grid_rast = terra::crop(env_data_proj, extent)
} else {
# if grid_extent specified and missing cellsize grid_extent is prior to
# cellsize and cellsize is slightly modified
grid_rast = env_data_proj
terra::ext(grid_rast) = extent
}
} else {
stop("Invalid or is.null cellsize - provide valid cellsize or env_data")
}
return(grid_rast)
}At first grid_calc() function evaluates extent of grid rast. For grid extent as SpatExtent, sf bbox object or vector it acts as extent for grid rast. Otherwise, extent is derived from GPS SpatVector data. DR, KDE and LS methods extents are additionally extended for the purpose of preventing cropping cells with values. Kernel Density Estimation and Density Ranking extensions apply bandwidth value. Line Segment, utilizing terra::buffer(), buffers in longitude/latitude CRS are calculated in meters. Therefore, extension by bandwidth in longitude/latitude CRS unit is not viable. Subsequently, the extent is based on boundary box of buffered GPS SpatVector layer.
if (!is.null(grid_extent)){ # grid_extent as ext of grid rast
extent = grid_extent
} else { # calculate extent
extent = terra::ext(data_proj)
if (!is.null(bandwidth)) { # for KDE/DR expand extent by bandwidth
if(is_LS == FALSE){
extent = c(terra::xmin(extent) - bandwidth, terra::xmax(extent) + bandwidth,
terra::ymin(extent) - bandwidth, terra::ymax(extent) + bandwidth)
} else { # for LS in lon/lat crs bandwidth is still in meters (terra::buffer)
temp_buff = terra::buffer(data_proj, bandwidth)
extent = terra::ext(temp_buff)
}
}
}Afterwards, computation of grid raster using cellsize, extent and SpatVector CRS is performed. Number of cells for longitude/latitude CRS is calculated individually implementing spherical distance to evaluate non-rectangular cells. Additionally, KDE and DR execute further bandwidth size and unit verification with the aim of feasible prevention of excessive memory usage in method specific calculations (Section 4.8). For missing or incorrect cellsize argument environmental SpatRaster is applied. In scenarios where grid extent is set, extent is preserved at the expense of environmental cellsize. Otherwise, extent is modified to maintain cellsize.
if(!is.null(cellsize) && length(cellsize) == 1 && is.numeric(cellsize) && cellsize > 0) { # cellsize included
if (suppressWarnings(!is.na(terra::is.lonlat(data_proj)) && terra::is.lonlat(data_proj))){
# crs units in degrees
# is.na if empty crs to skip error
warning("Cellsize is not stable - cells are not rectangular")
if (!is.null(bandwidth) && bandwidth > 0.1 && is_LS == FALSE) {
message(paste0("CRS is in lontitude/latitude and bandwidth is ", bandwidth,
" - bandwidth is calculated in CRS units. Is bandwidth in correct unit?"))
}
dist_lon = geosphere::distm(c(extent[1], extent[3]), c(extent[2], extent[3]),
fun = geosphere::distHaversine)
dist_lat = geosphere::distm(c(extent[1], extent[3]), c(extent[1], extent[4]),
fun = geosphere::distHaversine)
# number of cells
x_cells = (dist_lon / cellsize) |> as.integer()
y_cells = (dist_lat / cellsize) |> as.integer()
# empty_rast for units in degrees
grid_rast = terra::rast(crs = terra::crs(data_proj), nrows=y_cells,
ncols=x_cells, extent = extent)
} else {
grid_rast = terra::rast(crs = terra::crs(data_proj), extent = extent,
resolution = cellsize)
}
} else if (!is.null(env_data) && any(class(env_data) == "SpatRaster")){ #if incorrect cellsize and env_data exists
warning("Cellsize invalid or not set - using cellsize from env_data")
if (terra::linearUnits(env_data) == terra::linearUnits(data_proj)){
# project env data with the same cellsize
env_data_proj = terra::project(env_data, terra::crs(data_proj), res = terra::res(env_data)[1])
# if env_data in lat/lon only quadratic cells
} else {
env_data_proj = terra::project(env_data, terra::crs(data_proj))
}
if (is.null(grid_extent)){
# extent modified to preserve cellsize
grid_rast = terra::crop(env_data_proj, extent)
} else {
# if grid_extent specified and missing cellsize grid_extent is prior to
# cellsize and cellsize is slightly modified
grid_rast = env_data_proj
terra::ext(grid_rast) = extent
}
} else {
stop("Invalid or missing cellsize - provide valid cellsize or env_data")
}4.5 Convert SpatVector environmental data
With grid raster computed environmental SpatVector layer is rasterized in env_vect() function utilizing environmental buffer and environmental field arguments. Conversion to SpatRaster is performed on bike routes lines San Diego dataset.
# if env_data is vector data - create optional buffer and rasterize to grid raster
if (!is.null(env_data) && any(class(env_data) == "SpatVector")) {
env_data = env_vect(env = env_data, env_buff = env_buff,
env_field = env_field, grid = grid_rast)
}4.5.1 Env_vect function
Environmental buffer argument acts as tool to adapt SpatVector environmental datasets (points and lines in particular) to exposure calculation. Single positive numeric value functions as distance of buffer around SpatVector geometries. Environmental field acts as a field for rasterization utilizing sum function.
Code
env_vect = function(env, env_buff, env_field, grid){
if (!is.null(env_buff)) { # create buffer around vector data
if (length(env_buff) == 1 && is.numeric(env_buff) && env_buff > 0){
env_proj = terra::project(env, terra::crs(grid))
## TERRA::BUFFER SOMETIMES CRASHES WITH BIG LINE VECTOR DATASETS
#env = terra::buffer(env_proj, env_buff)
## TEMPORARY FIX: USE SF::ST_BUFFER
env = env_proj |> sf::st_as_sf() |> sf::st_buffer(env_buff) |> terra::vect()
} else {
warning("Invalid env_buff argument - ignoring creation of buffer")
}
}
if (!is.null(env_field)){ # choose field to rasterize
env_f_enq = rlang::enquo(env_field)
if (rlang::quo_name(env_f_enq) %in% terra::names(env)){
env = terra::rasterize(env, grid, field = rlang::quo_name(env_f_enq), fun = "sum")
} else {
warning("Invalid env_field argument - env_field is not a column name in data. Ignoring env_field argument")
env = terra::rasterize(env, grid, fun = "sum")
}
} else {
env = terra::rasterize(env, grid, fun = "sum")
}
return(env)
}Buffer around SpatVector bike routes lines San Diego data is calculated.
if (any(class(env_data) != "SpatRaster")){
if (!is.null(env_buff)) { # create buffer around vector data
if (length(env_buff) == 1 && is.numeric(env_buff) && env_buff > 0){
env_data_proj = terra::project(env_data, terra::crs(grid_rast))
## TERRA::BUFFER SOMETIMES CRASHES WITH BIG LINE VECTOR DATASETS
#env_data = terra::buffer(env_data_proj, env_buff)
## TEMPORARY FIX: USE SF::ST_BUFFER
env_data = env_data_proj |> sf::st_as_sf() |> sf::st_buffer(env_buff) |> terra::vect()
} else {
warning("Invalid env_buff argument - ignoring creation of buffer")
}
}
}
Afterwards, buffer SpatVector is rasterized using class environmental field.
if (!is.null(env_field)){ # choose field to rasterize
env_f_enq = rlang::enquo(env_field)
if (rlang::quo_name(env_f_enq) %in% terra::names(env_data)){
env_data = terra::rasterize(env_data, grid_rast, field = rlang::quo_name(env_f_enq), fun = "sum")
} else {
warning("Invalid env_field argument - env_field is not a column name in data. Ignoring env_field argument")
env_data = terra::rasterize(env_data, grid_rast, fun = "sum")
}
} else {
env_data = terra::rasterize(env_data, grid_rast, fun = "sum")
}
4.5.2 Line Segment
Line Segment requires assigning environmental data raster values to grid raster for further environmental exposure layer computation Section 4.9.2.
if (!is.null(env_data)){ # env_data included
# project env_data to grid_rast cellsize
env_data_proj = terra::project(env_data, grid_rast)
# replace vals of grid with env values
terra::values(grid_rast) = terra::values(env_data_proj)
}4.6 Group split
Group split argument enables splitting SpatVector GPS dataset into seperate SpatVectors by any field (day, movement type, participant ID etc.). Further output SpatRaster is calculated for each SpatVector data individually.
if (is.null(group_split)) {
data_iter = list(data_proj) # only one item for for loop
} else {
enq_group_split = rlang::enquo(group_split)
if (rlang::quo_name(enq_group_split) %in% terra::names(data_proj)){
data_iter = terra::split(data_proj, rlang::quo_name(enq_group_split)) # split data_proj by group_split
message(paste0("Data split by group into ", length(data_iter), " items"))
} else {
data_iter = list(data_proj)
warning("Invalid group_split argument - group_split is not a column in data. Data not split")
}
}Data split by group into 8 items
4.7 Activity Space Loop
Following splitting the data, each SpatVector is iterated in first loop. Activity space SpatRasters (PO/KDE/DR) or buffer time data SpatVectors (LS) are computed.
4.7.1 Point Overlay/Kernel Density Estimation/Density Ranking
In the beginning two empty lists destined to store loops results are generated. For extent and list discussion see Section 6.1. Activity space for each TWSA method is evaluated: PO - terra::rasterize(), KDE - spat_kde() and DR - spat_dr() (See Section 4.8). Every activity space SpatRaster is added as consecutive item in act_out list.
##DR - change spat_kde
# dr_rast = spat_dr(data_i, grid_rast, bandwidth)
##PO - change spat_kde
# rast_points = terra::rasterize(data_i, grid_rast, fun = "length")
act_out = list()
output = list()
for (data_i in data_iter){
# if each group should have seperate extent then output is a list rasts
# if all groups should have same extent then output is rast with n layers
### UNCOMMENT IF EVERY RAST SHOULD HAVE SEPERATE EXTENT
if (is.null(grid_extent)) {
#new ext for each group
# get extent
group_extent = terra::ext(data_i)
# new extent - expanded extent by bandwidth
new_group_extent = c(terra::xmin(group_extent) - bandwidth,
terra::xmax(group_extent) + bandwidth,
terra::ymin(group_extent) - bandwidth,
terra::ymax(group_extent) + bandwidth)
# crop ext of each rast
grid_crop = terra::crop(grid_rast, new_group_extent)
kde_rast = spat_kde(data_i, grid_crop, bandwidth)
} else {
kde_rast = spat_kde(data_i, grid_rast, bandwidth)
}
### UNCOMMENT IF EVERY RAST SHOULD HAVE SEPERATE EXTENT
### UNCOMMENT IF EVERY RAST SHOULD HAVE SAME EXTENT
#kde_rast = spat_kde(data_i, grid_rast, bandwidth)
### UNCOMMENT IF EVERY RAST SHOULD HAVE SAME EXTENT
# normalization without groups
if (normalize == TRUE && (norm_group == FALSE || norm_method != "range" || length(data_iter) == 1)){
if (norm_method != "range" && norm_group == TRUE) {
message(paste0('Norm_method is "', norm_method, '" - norm_group is TRUE is applicable only for norm_method "range". Norm group argument ignored. Normalizing each group seperately'))
}
# calculate normalization
terra::values(kde_rast) = BBmisc::normalize(terra::values(kde_rast),
method = norm_method, margin = 2L)
}
kde_rast = terra::subst(kde_rast, from = 0, to = NA)
### UNCOMMENT IF EVERY RAST SHOULD BE A SEPERATE ELEMENT IN LIST
act_out[length(act_out) + 1] = as.list(kde_rast)
### UNCOMMENT IF EVERY RAST SHOULD BE A SEPERATE ELEMENT IN LIST
### UNCOMMENT IF ALL RAST AS STACK RASTER
#act_out = append(act_out, kde_rast)
### UNCOMMENT IF ALL RAST AS STACK RASTER
}
if (normalize == TRUE && norm_method == "range" && norm_group == TRUE && length(data_iter) > 1){
max_val = max(sapply(act_out, terra::minmax)[2,], na.rm = TRUE)
act_out = sapply(act_out, function(x) {
if (!(all(is.na(terra::values(x))))) {
terra::values(x) = BBmisc::normalize(terra::values(x), method = "range",
margin = 2L,
range = c(0, terra::minmax(x)[2] / max_val))
}
return(x)
})
}
| count | area | min | max | range | mean | std | sum | |
|---|---|---|---|---|---|---|---|---|
| PO | 1091 | 2729650.5 | 1.0000000 | 25.0000000 | 24.0000000 | 1.5609533 | 1.9528579 | 1703.0000000 |
| KDE | 19473 | 48720776.9 | 0.0000000 | 0.0019976 | 0.0019976 | 0.0000350 | 0.0001175 | 0.6811803 |
| DR | 6858 | 17158502.8 | 0.0258368 | 0.9994128 | 0.9735760 | 0.2210858 | 0.1874312 | 1516.2066941 |
| PO_20 | 145 | 362787.4 | 1.0000000 | 15.0000000 | 14.0000000 | 1.9586207 | 1.9329301 | 284.0000000 |
| KDE_20 | 2092 | 5234140.3 | 0.0000000 | 0.0015232 | 0.0015232 | 0.0000562 | 0.0001512 | 0.1175879 |
| DR_20 | 622 | 1556233.0 | 0.0816327 | 0.9591837 | 0.8775510 | 0.2381827 | 0.1748179 | 148.1496599 |
| PO_21 | 157 | 392811.8 | 1.0000000 | 14.0000000 | 13.0000000 | 1.4585987 | 1.3472806 | 229.0000000 |
| KDE_21 | 1980 | 4953934.1 | 0.0000000 | 0.0006602 | 0.0006602 | 0.0000469 | 0.0000750 | 0.0928017 |
| DR_21 | 834 | 2086656.6 | 0.0043103 | 0.9612069 | 0.9568966 | 0.2755830 | 0.2262931 | 229.8362069 |
| PO_23 | 105 | 262708.7 | 1.0000000 | 6.0000000 | 5.0000000 | 1.4095238 | 0.8804246 | 148.0000000 |
| KDE_23 | 1344 | 3362671.0 | 0.0000000 | 0.0005180 | 0.0005180 | 0.0000452 | 0.0000699 | 0.0607986 |
| DR_23 | 629 | 1573749.9 | 0.0328947 | 0.9868421 | 0.9539474 | 0.2575412 | 0.2138723 | 161.9934211 |
4.7.2 Groups normalization
With implementation of group split argument two seperate groups normalization approaches are derived. First, normalization without groups is performed within activity space loop. It is applicable to all normalization methods. Each SpatRaster undergoes individual normalization using BBmisc:normalize(). Secondly, normalization with groups is conducted following activity space loop’s completion. SpatRaster with highest maximum value is rescaled to 0-1 range. Residual SpatRasters are rescaled in regards of prior SpatRaster. The latter approach is executed solely utilizing range normalization method. Therefore, whilst normalization method is distinct from range, the former is conducted instead.
if (normalize == TRUE && (norm_group == FALSE || norm_method != "range" || length(data_iter) == 1)){
if (norm_method != "range" && norm_group == TRUE) {
message(paste0('Norm_method is "', norm_method, '" - norm_group is TRUE is applicable only for norm_method "range". Norm group argument ignored. Normalizing each group seperately'))
}
# calculate normalization
terra::values(kde_rast) = BBmisc::normalize(terra::values(kde_rast),
method = norm_method, margin = 2L)
}
| min | max | range | mean | std | sum | |
|---|---|---|---|---|---|---|
| PO_f_20 | 0.0000000 | 1 | 1.0000000 | 0.0684729 | 0.1380664 | 9.928571 |
| KDE_f_20 | 0.0000000 | 1 | 1.0000000 | 0.0369021 | 0.0992412 | 77.199191 |
| DR_f_20 | 0.0851064 | 1 | 0.9148936 | 0.2483182 | 0.1822570 | 154.453901 |
| PO_f_21 | 0.0000000 | 1 | 1.0000000 | 0.0352768 | 0.1036370 | 5.538462 |
| KDE_f_21 | 0.0000001 | 1 | 0.9999999 | 0.0709961 | 0.1136578 | 140.572211 |
| DR_f_21 | 0.0044843 | 1 | 0.9955157 | 0.2867052 | 0.2354260 | 239.112108 |
| PO_f_23 | 0.0000000 | 1 | 1.0000000 | 0.0819048 | 0.1760849 | 8.600000 |
| KDE_f_23 | 0.0000002 | 1 | 0.9999998 | 0.0873281 | 0.1349004 | 117.368978 |
| DR_f_23 | 0.0333333 | 1 | 0.9666667 | 0.2609751 | 0.2167239 | 164.153333 |
if (normalize == TRUE && norm_method == "range" && norm_group == TRUE && length(data_iter) > 1){
max_val = max(sapply(act_out, terra::minmax)[2,], na.rm = TRUE)
act_out = sapply(act_out, function(x) {
if (!(all(is.na(terra::values(x))))) {
terra::values(x) = BBmisc::normalize(terra::values(x), method = "range",
margin = 2L,
range = c(0, terra::minmax(x)[2] / max_val))
}
return(x)
})
}
| min | max | range | mean | std | sum | |
|---|---|---|---|---|---|---|
| PO_t_20 | 0 | 0.8823529 | 0.8823529 | 0.0604173 | 0.1218233 | 8.760504 |
| KDE_t_20 | 0 | 1.0000000 | 1.0000000 | 0.0369021 | 0.0992412 | 77.199187 |
| DR_t_20 | 0 | 0.9591837 | 0.9591837 | 0.1711129 | 0.1910801 | 106.432210 |
| PO_t_21 | 0 | 0.8235294 | 0.8235294 | 0.0290515 | 0.0853481 | 4.561086 |
| KDE_t_21 | 0 | 0.4334178 | 0.4334178 | 0.0307709 | 0.0492613 | 60.926389 |
| DR_t_21 | 0 | 0.9612069 | 0.9612069 | 0.2724946 | 0.2273124 | 227.260485 |
| PO_t_23 | 0 | 0.3529412 | 0.3529412 | 0.0289076 | 0.0621476 | 3.035294 |
| KDE_t_23 | 0 | 0.3400875 | 0.3400875 | 0.0296991 | 0.0458780 | 39.915647 |
| DR_t_23 | 0 | 0.9868421 | 0.9868421 | 0.2323929 | 0.2212472 | 146.175136 |
4.7.3 Line segment
In the beginning, two empty lists destined to store loops results are generated. Selected extent and output option will be included as Section 6.1 discussion is resolved. Line Segment activity space loop is disparate from PO/KDE/DR loop. Line Segment method’s output is a list of buffer SpatVectors data. Additionally, it follows specific workflow Section 4.8.4. Here, time elapsed is calculated from time data and stored as attribute in buffer SpatVector. Line Segment method offers two ways of normalization (time_only_data argument) further discussed in Section 6.2.
buff_list = list()
output = list()
for (data_i in data_iter) {
# create line segments from spatial points
trajectories = terra::vect(trajectories_fun(data_i))
# create buffers over line segments
traj_buff = trajectories |>
tidyterra::select(line_id) |>
terra::buffer(bandwidth)
traj_buff$act = 1
if (!is.null(time_data)){
#time_data_null = dplyr::enquo(time_data)
duration_line_id = data_i |>
dplyr::mutate(time_elapsed = as.numeric(difftime(dplyr::lead(.data[[time_cname]]),
.data[[time_cname]], units = time_unit)),
line_id = dplyr::row_number()) |>
dplyr::select(line_id, time_elapsed)
# last NA value set to 0 for normalization to not set lowest time value to 0
duration_line_id$time_elapsed[nrow(duration_line_id)] = 0
#1st type of normalize - normalize time_elapsed without spatial reference
if(normalize == TRUE && time_only_norm == TRUE &&
(norm_group == FALSE || norm_method != "range" || length(data_iter) == 1)) {
if (norm_method != "range" && norm_group == TRUE) {
message(paste0('Norm_method is "', norm_method, '" - norm_group is TRUE is applicable only for norm_method "range". Norm group argument ignored. Normalizing each group seperately'))
}
duration_line_id$time_elapsed = BBmisc::normalize(duration_line_id$time_elapsed, method = norm_method, margin = 1L)
}
duration_df = as.data.frame(duration_line_id)
traj_buff = dplyr::left_join(traj_buff, duration_df, by = 'line_id')
traj_buff = traj_buff[,-2]
names(traj_buff)[2] = "act"
}
if(normalize == TRUE && (time_only_norm == FALSE || is.null(time_data)) &&
(norm_group == FALSE || length(data_iter) == 1)){
#2nd type of normalize - normalize time_elapsed or buffers using spatial reference - rasterizing and reading min max data from raster
if (norm_method != "range") {
message(paste0('Norm_method is "', norm_method, '" - time_only_norm is FALSE or time_data is missing. This type of time_only normalization is applicable only for norm_method "range". Setting norm_method to "range"'))
}
# slightly varying output values (max very close to 1)
norm_rast = terra::rasterize(traj_buff, grid_rast, field = "act", fun = "sum")
#read max val
max_norm = terra::minmax(norm_rast)[2]
# normalize
if (length(unique(traj_buff$act)) == 1){
#if vector is constant adjust BBmisc
traj_buff$act = BBmisc::normalize(traj_buff$act, method = "range",
margin = 1L, range = c(0, max(traj_buff$act) / max_norm * 2))
} else{
traj_buff$act = BBmisc::normalize(traj_buff$act, method = "range",
margin = 1L, range = c(0, max(traj_buff$act) / max_norm))
}
}
buff_list[length(buff_list) + 1] = list(traj_buff)
}
Line Segment group normalization is conducted analogously to PO, KDE and DR activity space loop, with the exception of Line Segment adjustment. Rather than normalizing SpatRaster values, time elapsed attribute is processed.
if(normalize == TRUE && time_only_norm == TRUE && !missing(time_data) &&
norm_group == TRUE && length(data_iter) > 1 && norm_method == "range"){
# 1st normalize variant for all rasts
if (length(data_iter) > 1){
max_buff = (do.call(rbind, buff_list))$act |> max(na.rm = TRUE)
buff_list = sapply(buff_list, function(x) {
if (!(all(is.na(x$act)))) {
x$act = BBmisc::normalize(x$act, method = "range",
margin = 2L,
range = c(0, max(x$act, na.rm = TRUE) / max_buff))
}
return(x)
})
} else {
max_buff = buff_list[[1]]$act |> max(na.rm = TRUE)
buff_list = sapply(buff_list, function(x) {
if (!(all(is.na(x$act)))) {
x$act = BBmisc::normalize(x$act, method = "range",
margin = 2L,
range = c(0, max(x$act, na.rm = TRUE) / max_buff))
}
return(x)
})
}
} else if (normalize == TRUE && (time_only_norm == FALSE || missing(time_data)) &&
norm_group == TRUE && length(data_iter) > 1 && norm_method == "range"){
# 2nd normalize variant for all rasts
norm_list = sapply(buff_list, terra::rasterize, grid_rast, field = "act", fun = "sum")
max_multiple_norm = max(sapply(norm_list, terra::minmax)[2,], na.rm = TRUE)
buff_list = sapply(buff_list, function(x) {
if (!(all(is.na(x$act)))) {
x$act = BBmisc::normalize(x$act, method = "range",
margin = 2L,
range = c(0, max(x$act, na.rm = TRUE) / max_multiple_norm))
}
return(x)
})
}4.8 Method specific calculations
4.8.1 Point Overlay
Activity space of Point Overlay method is evaluated by converting points vector data to raster using length function.
rast_points = terra::rasterize(data_i, grid_rast, fun = "length")4.8.2 Kernel Density Estimation
The spat_kde() implemented in exposure_KDE() utilizes kde() function. GPS SpatVector points (x), grid rast (ref) and bandwidth (bw) are processed for the purpose of computing Kernel Density Estimation SpatRaster.
Code
spat_kde = function(x, ref, bw){
# coords of each ref cell center
gx = terra::crds(ref)[,1] |> unique()
gy = terra::crds(ref)[,2] |> unique()
# distance from each x/y ref cell center to each x/y x points (squared)
kde_val = kde(x, gx, gy, bw)
output_list = list(x = gx, y = gy, z = kde_val)
# create df of x/y coords and val
pts = data.frame(expand.grid(x = output_list$x, y = output_list$y),
z = as.vector(array(output_list$z, length(output_list$z))))
# create points SpatVector
pts = terra::vect(pts, geom = c("x", "y"))
# rasterize to ref
return(terra::rasterize(pts, ref, field = "z"))
}The kde() function produces KDE exposure matrix from SpatVector points, two numeric vectors indicating columns and rows coordinates of SpatRaster and bandwidth. With the intention of optimazing memory efficiency, computation of each column is performed seperately.
Code
kde = function(points, ref_uq_x, ref_uq_y, bw) {
ax <- outer(ref_uq_x, terra::crds(points)[, 1], "-" ) ^ 2
ay <- outer(ref_uq_y, terra::crds(points)[, 2], "-" ) ^ 2
# points within quadratic search radius
ax_T = ax <= bw ^ 2
ay_T = ay <= bw ^ 2
# every x row index
positions = matrix(1:nrow(ax), ncol = 1)
# calculate KDE for each column seperately
density_mx = apply(positions, 1, function(xcol) {
# which point's x coord within possible search radius
cols_T = which(ax_T[xcol,] == TRUE)
# matrix cell coordinates of points within quadratic search radius
rows_T = which(ay_T[,cols_T] == TRUE, arr.ind = TRUE)
if (length(cols_T) <= 1){ # if only one or zero points within quadratic search radius
suppressWarnings({rows_T = rows_T |> array(dim = c(length(rows_T), 1)) |> cbind(1)})
colnames(rows_T) = c("row", "col")
}
# calculate distance within quadratic search radius
df_row = rows_T |> as.data.frame() |>
dplyr::mutate(sum_val = ay[cbind(row, cols_T[col])] + ax[cbind(xcol, cols_T[col])]) |>
dplyr::filter(sum_val <= bw ^ 2) |> # filter points within search radius based on real distance
dplyr::mutate(sum_val = (sum_val / bw ^ 2 * (-1) + 1) ^ 2) # calculate impact of each point on cells
#create empty matrix
empty_mx = matrix(NA, nrow = nrow(ay), ncol = length(cols_T))
# insert values based on cell coordinates
empty_mx[cbind(df_row$row, df_row$col)] = df_row$sum_val
# sum impact of every point to calculate KDE
mx_sum = rowSums(empty_mx, na.rm = TRUE)
return(mx_sum)})
# transpose matrix and final KDE calculations
out = t(density_mx) * (3/pi) / bw ^ 2
return(out)
}Calculations are conducted on seven sample GPS Geolife Data SpatVector points located in center of San Diego and cropped to sample GPS data extent 5 x 5 grid raster.
#example ref grid and points
exp_point = geolife_sandiego |> filter(dateTime %in% c("2011-08-18 02:27:11", "2011-08-19 04:24:54", "2011-08-21 03:57:00", "2011-08-18 03:16:43", "2011-08-18 01:26:31", "2011-08-21 04:24:11", "2011-08-21 03:03:14")) |> vect(geom = c("lon", "lat"), crs = "EPSG:4326") |> project(crs(data_proj))
ref_exp = crop(grid_rast, ext(exp_point))X coordinates matrix
| x | y |
|---|---|
| 484833.2 | 3619702 |
| 484817.4 | 3619649 |
| 484957.7 | 3619852 |
| 484988.0 | 3619654 |
| 484856.3 | 3619633 |
| 484761.3 | 3619643 |
| 484884.0 | 3619700 |
\begin{bmatrix} 484763 &484813 &484863 &484913 &484963 \\484763 &484813 &484863 &484913 &484963 \\484763 &484813 &484863 &484913 &484963 \\484763 &484813 &484863 &484913 &484963 \\484763 &484813 &484863 &484913 &484963 \\ \end{bmatrix}
Y coordinates matrix
\begin{bmatrix} 3619850 &3619850 &3619850 &3619850 &3619850 \\3619800 &3619800 &3619800 &3619800 &3619800 \\3619750 &3619750 &3619750 &3619750 &3619750 \\3619700 &3619700 &3619700 &3619700 &3619700 \\3619650 &3619650 &3619650 &3619650 &3619650 \\ \end{bmatrix}
Column and rows coordinates are retrieved from ref SpatRaster. Subsequently, distance matrixes from every SpatVector point to each column and row is computed.
#points to row and col distance matrixes
gx = terra::crds(ref_exp)[,1] |> unique()
gy = terra::crds(ref_exp)[,2] |> unique()
# distance matrixes
ax = abs(outer(gx, terra::crds(exp_point)[, 1], "-" ))
ay = abs(outer(gy, terra::crds(exp_point)[, 2], "-" ))
# used in further calculation
ax <- outer(gx, terra::crds(exp_point)[, 1], "-" ) ^ 2
ay <- outer(gy, terra::crds(exp_point)[, 2], "-" ) ^ 2Distance matrix from SpatVector points to columns
\begin{bmatrix} 70 &54 &195 &225 &93 &2 &121 \\20 &4 &145 &175 &43 &52 &71 \\30 &46 &95 &125 &7 &102 &21 \\80 &96 &45 &75 &57 &152 &29 \\130 &146 &5 &25 &107 &202 &79 \\ \end{bmatrix}
Distance matrix from SpatVector points to rows
\begin{bmatrix} 148 &201 &2 &196 &217 &207 &150 \\98 &151 &52 &146 &167 &157 &100 \\48 &101 &102 &96 &117 &107 &50 \\2 &51 &152 &46 &67 &57 &0 \\52 &1 &202 &4 &17 &7 &50 \\ \end{bmatrix}
SpatVector points within search radius (bandwidth) are indicated. Based on distance matrix boolean matrix is derived.
# matrixes with points within search radius T/F
ax_T = ax <= 100
ay_T = ay <= 100
# used in further calculation
ax_T = ax <= 100 ^ 2
ay_T = ay <= 100 ^ 2Search radius of columns matrix
\begin{bmatrix} TRUE &TRUE &FALSE &FALSE &TRUE &TRUE &FALSE \\TRUE &TRUE &FALSE &FALSE &TRUE &TRUE &TRUE \\TRUE &TRUE &TRUE &FALSE &TRUE &FALSE &TRUE \\TRUE &TRUE &TRUE &TRUE &TRUE &FALSE &TRUE \\FALSE &FALSE &TRUE &TRUE &FALSE &FALSE &TRUE \\ \end{bmatrix}
Search radius of rows matrix
\begin{bmatrix} FALSE &FALSE &TRUE &FALSE &FALSE &FALSE &FALSE \\TRUE &FALSE &TRUE &FALSE &FALSE &FALSE &FALSE \\TRUE &FALSE &FALSE &TRUE &FALSE &FALSE &TRUE \\TRUE &TRUE &FALSE &TRUE &TRUE &TRUE &TRUE \\TRUE &TRUE &FALSE &TRUE &TRUE &TRUE &TRUE \\ \end{bmatrix}
Calculations are performed on every column seperately. Retrieve points within search radius of current column and all rows and generate matrix of retrieved points coordinates in rows boolean matrix.
| row | col | sum_val |
|---|---|---|
| 2 | 1 | 120.624429 |
| 3 | 1 | 85.065152 |
| 4 | 1 | 70.156307 |
| 5 | 1 | 87.222329 |
| 4 | 2 | 74.716808 |
| 5 | 2 | 54.413731 |
| 4 | 3 | 114.849433 |
| 5 | 3 | 94.808043 |
| 4 | 4 | 56.811929 |
| 5 | 4 | 7.005758 |
xcol = 1
# select points with x coord within possible search radius to single column
cols_T = which(ax_T[xcol, ] == TRUE)
## [1] 1 2 5 6
# matrix cell coordinates of points within quadratic search radius
rows_T = which(ay_T[,cols_T] == TRUE, arr.ind = TRUE)Compute euclidean distance and indicate points within euclidean search radius. Points beyond search radius are removed.
| row | col | sum_val |
|---|---|---|
| 3 | 1 | 85.065152 |
| 4 | 1 | 70.156307 |
| 5 | 1 | 87.222329 |
| 4 | 2 | 74.716808 |
| 5 | 2 | 54.413731 |
| 5 | 3 | 94.808043 |
| 4 | 4 | 56.811929 |
| 5 | 4 | 7.005758 |
dist = rows_T |> as.data.frame() |>
dplyr::mutate(sum_val = ay[cbind(row, cols_T[col])] + ax[cbind(xcol, cols_T[col])])
#filter search radius again
dist_filt = dist |> dplyr::filter(sum_val <= 100 ^ 2)Further, impact of selected points on value of each cell is evaluated.
# kde calculations
df_row = dist_filt |> dplyr::mutate(sum_val = (sum_val / 100 ^ 2 * (-1) + 1) ^ 2)| row | col | sum_val |
|---|---|---|
| 3 | 1 | 0.0763925 |
| 4 | 1 | 0.2578702 |
| 5 | 1 | 0.0572293 |
| 4 | 2 | 0.1951341 |
| 5 | 2 | 0.4954957 |
| 5 | 3 | 0.0102300 |
| 4 | 4 | 0.4586547 |
| 5 | 4 | 0.9902080 |
Empty matrix with ay number of rows and indicated points number of columns is generated. Point impact values are inserted into empty matrix.
#create empty matrix
empty_mx = matrix(NA, nrow = nrow(ay), ncol = length(cols_T))
# insert values based on cell coordinates
empty_mx[cbind(df_row$row, df_row$col)] = df_row$sum_valMatrix of every column impacting point
\begin{bmatrix} NA &NA &NA &NA \\NA &NA &NA &NA \\0.08 &NA &NA &NA \\0.26 &0.2 &NA &0.46 \\0.06 &0.5 &0.01 &0.99 \\ \end{bmatrix}
Rows of matrix are summed to evaluate entire points impact on each cell.
# sum all points to calc cell value
mx_sum = rowSums(empty_mx, na.rm = TRUE)\begin{bmatrix} 0 \\0 \\0.076 \\0.912 \\1.553 \\ \end{bmatrix}
Evaluation is repeated for every column in ref SpatRaster computing values for all cells.
\begin{bmatrix} 0 &0 &0.01 &0.64 &0.99 \\0 &0 &0 &0.28 &0.53 \\0.08 &0.59 &0.96 &0.46 &0.02 \\0.91 &2.01 &2.32 &1.07 &0.67 \\1.55 &2.68 &2.47 &1.07 &0.89 \\ \end{bmatrix}
Final KDE calculations are performed and output matrix is returned.
out_kde = t(density_mx) * (3/pi) / 100 ^ 2\begin{bmatrix} 0 &0 &0.000001 &0.000061 &0.000095 \\0 &0 &0 &0.000027 &0.00005 \\0.000007 &0.000056 &0.000091 &0.000044 &0.000002 \\0.000087 &0.000192 &0.000222 &0.000102 &0.000064 \\0.000148 &0.000256 &0.000236 &0.000102 &0.000085 \\ \end{bmatrix}
Matrix values are converted into ref SpatRaster.
output_list = list(x = gx, y = gy, z = out_kde)
# create df of x/y coords and val
pts = data.frame(expand.grid(x = output_list$x, y = output_list$y),
z = as.vector(array(output_list$z, length(output_list$z))))
# create points SpatVector
pts = terra::vect(pts, geom = c("x", "y"))
# rasterize to ref
out_rast = terra::rasterize(pts, ref_exp, field = "z")
out_rast = terra::subst(out_rast, from = 0, to = NA)4.8.3 Density Ranking
The spat_dr() function implemented in exposure_DR() uses kde() and kde_points() functions to process GPS points (x), grid raster (ref) and bandwidth (bw) into Density Ranking SpatRaster. It utilizes stats::ecdf() to calculate DR from KDE ref raster values and KDE points values.
Code
spat_dr = function(x, ref, bw) {
gx = terra::crds(ref)[,1] |> unique()
gy = terra::crds(ref)[,2] |> unique()
# distance from each x/y ref cell center to each x/y x points (squared)
kde_val = kde(x, gx, gy, bw)
dr_val = kde_points(terra::crds(x)[,1], terra::crds(x)[,2], bw)
# DR EVAL
dr_calc = stats::ecdf(dr_val)(as.vector(kde_val))
output_list = list(x = gx, y = gy, z = dr_calc)
# create df of x/y coords and val
pts = data.frame(expand.grid(x = output_list$x, y = output_list$y),
z = output_list$z)
# create points SpatVector
pts = terra::vect(pts, geom = c("x", "y"))
# rasterize to ref
return(terra::rasterize(pts, ref, field = "z"))
}The kde_points() calculates KDE of every SpatVector point (instead of cells as in kde()). Function utilizes two vectors of SpatVector point’s x and y coordinates and bandwidth. Similarly to kde(), kde_points() computation is segmented into seperate points to optimize memory efficiency. Calculations are executed on example data produced beforehand in Listing 6.
Code
kde_points = function(vect_x, vect_y, bw){
# apply for each cell coords seperately
kde_p = mapply(function(x, y) {
#distance betweeen each point
ax_p = (vect_x - x) ^ 2
ay_p = (vect_y - y) ^ 2
# boolean if dist within search radius
ax_p_T = ax_p <= bw ^ 2
ay_p_T = ay_p <= bw ^ 2
# choose dist^ 2 within search radius
x_T = which(ax_p_T == TRUE)
y_T = which(ay_p_T == TRUE)
# index of coords of which both x and y within search radius
xy_T = intersect(x_T, y_T)
# calculate dist ^ 2 and choose thoso within search radius
xy_val = ax_p[xy_T] + ay_p[xy_T]
xy_val = xy_val[xy_val < bw ^ 2]
#
xy_calc = (xy_val / bw ^ 2 * (-1) + 1) ^ 2
xy_out = sum(xy_calc)
}, x = vect_x, y = vect_y) * (3/pi) / bw ^ 2
return(kde_p)
}# same example points as in kde()
vect_x = terra::crds(exp_point)[,1]
## [1] 484833.2 484817.4 484957.7 484988.0 484856.3 484761.3 484884.0
vect_y = terra::crds(exp_point)[,2]
## [1] 3619702 3619649 3619852 3619654 3619633 3619643 3619700First pair of coordinates is retrieved.
xi = vect_x[1]
## [1] 484833.2
yi = vect_y[1]
## [1] 3619702Initially, distance between current point and each point’s x and y coordinates is derived.
# distance between each point's row and column
abs(vect_x - xi)
## [1] 0.00000 15.73158 124.58766 154.83885 23.13644 71.87595 50.88448
abs(vect_y - yi)
## [1] 0.000000 53.075746 150.130624 48.124734 68.876543 58.643418 1.962094
# used in further calculation
ax_p = (vect_x - xi) ^ 2
ay_p = (vect_y - yi) ^ 2Indicate points within search radius (bandwidth). Based on point distance vector boolean vector is derived.
# points within search radius in x and y
ax_p_T = ax_p <= 100 ^ 2
## [1] TRUE TRUE FALSE FALSE TRUE TRUE TRUE
ay_p_T = ay_p <= 100 ^ 2
## [1] TRUE TRUE FALSE TRUE TRUE TRUE TRUEAfterwards, retrieve points within search radius of current point x and y coordinates and generate vector of retrieved points indexes in x and y coordinates vectors.
# intersect x and y to get indexes of point coords within search radius
x_T = which(ax_p_T == TRUE)
## [1] 1 2 5 6 7
y_T = which(ay_p_T == TRUE)
## [1] 1 2 4 5 6 7
xy_T = intersect(x_T, y_T)
## [1] 1 2 5 6 7Compute euclidean distance and indicate points within euclidean search radius. Points beyond search radius are removed.
# calc dist and search radius filter again
xy_val = sqrt(ax_p[xy_T] + ay_p[xy_T])
## [1] 0.00000 55.35808 72.65860 92.76423 50.92230
xy_val[xy_val < 100]
## [1] 0.00000 55.35808 72.65860 92.76423 50.92230
# used in further calculation
xy_val = ax_p[xy_T] + ay_p[xy_T]
xy_val = xy_val[xy_val < 100 ^ 2]Process impact of points calculations and sum all values to compute entire points impact on current point.
# additional calc and sum all rows
xy_calc = (xy_val / 100 ^ 2 * (-1) + 1) ^ 2
## [1] 1.00000000 0.48100918 0.22285265 0.01945458 0.54862461
xy_out = sum(xy_calc)
## [1] 2.271941Evaluation is repeated for all points and final KDE calculations are performed.
kde_p
## [1] 2.271941 2.711617 1.000000 1.000000 2.135161 1.491677 1.861494
kde_p_out = kde_p * (3/pi) / 100 ^ 2
## [1] 0.00022 0.00026 0.00010 0.00010 0.00020 0.00014 0.00018Formerly computated grid KDE and point KDE are utilized in Density Ranking using Empiracal Cumulative Distribution Function, ranking KDE values.
dr_calc = stats::ecdf(kde_p_out)(as.vector(out_kde))
## [1] 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.57 0.86 0.29 0.00
## [21] 0.43 0.86 0.86 0.29 0.00\begin{bmatrix} 0 &0 &0 &0 &0 \\0 &0 &0 &0 &0 \\0 &0 &0 &0 &0 \\0 &0.57 &0.86 &0.29 &0 \\0.43 &0.86 &0.86 &0.29 &0 \\ \end{bmatrix}
Vector values are converted into ref SpatRaster.
output_dr_list = list(x = gx, y = gy, z = dr_calc)
# create df of x/y coords and val
pts_dr = data.frame(expand.grid(x = output_dr_list$x, y = output_dr_list$y),
z = output_dr_list$z)
# create points SpatVector
pts_dr = terra::vect(pts_dr, geom = c("x", "y"))
# rasterize to ref
dr_out_rast = terra::rasterize(pts_dr, ref_exp, field = "z")
dr_out_rast = terra::subst(dr_out_rast, from = 0, to = NA)4.8.4 Line Segment
The expoosure_LS() function applies trajectories_fun() to evaluate trajectories from every two GPS data SpatVector points (ordered).
#Line segment trajectories
trajectories_fun = function(data){
t_name = c(names(data))[grepl('time', c(names(data)))]
lst_name = paste(t_name[which.max(nchar(t_name))], "_1", sep = "")
trajectories_out = data |>
sf::st_as_sf() |> # change to sf object to change points to linestring later
#filter to the study id defined by the argument
#define start and end points of line
dplyr::mutate(
line_id = dplyr::row_number(),#an id for each "line segment"
x_start= sf::st_coordinates(geometry)[,1],
y_start= sf::st_coordinates(geometry)[,2],
x_end = dplyr::lead(x_start),
y_end = dplyr::lead(y_start)
) |>
dplyr::ungroup() |>
sf::st_set_geometry(NULL) |>
#exclude the last observation, which has no "lead", and will be missing.
dplyr::filter(is.na(x_end)==FALSE) |>
# Make the data long form so that each point has two observations
tidyr::pivot_longer(
# select variables to pivot longer.
cols = c(x_start, y_start, x_end, y_end),
#value goes to "x/y", and time goes to "_start/end"
names_to = c(".value", lst_name),
names_repair = "unique",
names_sep = "_"#the separator for the column name
) |>
# create sf object once again
sf::st_as_sf(coords = c("x", "y"), crs= sf::st_crs(data)) |>
dplyr::group_by(line_id) |>
#see Edzer's answer here:https://github.com/r-spatial/sf/issues/851
#do_union=FALSE is needed.
dplyr::summarize(do_union = FALSE) |>
sf::st_cast("LINESTRING") |> # cast linestring type
sf::st_as_sf() |>
dplyr::ungroup()
return(trajectories_out)
}Subsequently, buffer around all trajectories is evaluated, later applied in normalization and output SpatRaster calculations.
# create line segments from spatial points
trajectories = terra::vect(trajectories_fun(data_i))
# create buffers over line segments
traj_buff = trajectories |>
tidyterra::select(line_id) |>
terra::buffer(bandwidth)
traj_buff$act = 14.9 Environmental exposure loop
4.9.1 Point Overlay/Kernel Density Estimation/Density Ranking
Second, environmental exposure, loop for calculating environmental exposure is implemented. Environmental exposure layer is computed by multiplying activity space layer and environmental data layer.
for (out in act_out) {
if (!is.null(env_data)){ # calculate exposure
# project env_data to grid
env_data_proj = terra::project(env_data, out)
# if same ext for groups out -> grid_rast and this code chunk out of loop
#env_data_resamp = terra::resample(env_data_proj, out)
rast_env_kde = out * env_data_proj
end_rast = rast_env_kde
} else {
end_rast = out
}
if (length(data_iter) == 1) { # if one group or no group output is not a list
output = end_rast
} else {
output[length(output) + 1] = as.list(end_rast)
}
}4.9.2 Line Segment
Computing environmental exposure layer in Line Segment method utilizes grid raster with assigned environmental data values. Grid raster is extracted to every SpatVector trajectory buffer, converted to one data frame and summarized by area overlap for each segment afterwards. Summarized exposure data is joined with SpatVector trajectory buffer and multiplied by normalized or not normalized time weights. Finally, vector data is converted to raster based on exposure and time weights with sum function. Activity space calculation bypasses extraction of environmental values and rasterizes time weights to grid raster using sum function.
for (buff in buff_list) {
if (!missing(env_data)){ #all of computation necessary only when calculating env_data
# extract values and weights (area overlap) from grid for each cell of buffer
exac_extr = exactextractr::exact_extract(grid_rast, sf::st_as_sf(buff), progress = FALSE)
traj_extract = Map(function(x, id) {x$line_id = id
return(x)}, x = exac_extr, seq_along(exac_extr)) |> dplyr::bind_rows() |>
dplyr::as_tibble() |>
dplyr::relocate(line_id) |>
dplyr::rename(e = 2)
# calculate summarised exposure for each buffer
traj_extract_line_id = traj_extract |>
dplyr::group_by(line_id) |>
dplyr::summarise(
#Jan 9, 2024 use R's built-in weighted.mean() function
#instead of calculating weighted average manually
end_weights=stats::weighted.mean(
x=e,
w=coverage_fraction,
na.rm=TRUE),
#These weights are based on the areal overlap, not time
sum_weights = sum(coverage_fraction,na.rm=TRUE),
n_pixel = dplyr::n() # number of observations corresponds to number of pixels per line segment
) |>
dplyr::ungroup()
# join weights with spatial buffer
weight_buff = dplyr::inner_join(buff, traj_extract_line_id, by = 'line_id')
weight_buff$end_weights = weight_buff$end_weights * weight_buff$act
# rasterize results
rast_segment = terra::rasterize(weight_buff, grid_rast, field = "end_weights", fun = "sum")
} else {
rast_segment = terra::rasterize(buff, grid_rast, field = "act", fun = "sum")
}
if (length(data_iter) == 1) { # if one group or no group output is not a list
output = rast_segment
} else {
output[length(output) + 1] = as.list(rast_segment)
}
}| line_id | e | coverage_fraction |
|---|---|---|
| 1 | 0.1509941 | 0.2779211 |
| 1 | 0.3074401 | 0.8561569 |
| 1 | 0.2196989 | 1.0000000 |
| 1 | 0.2476714 | 1.0000000 |
| 1 | 0.2691399 | 0.8558987 |
| 1 | 0.2917317 | 0.2785110 |
| 1 | 0.0974092 | 0.2394574 |
| 1 | 0.0545777 | 0.9830808 |
| 1 | 0.1115729 | 1.0000000 |
| 1 | 0.2017748 | 1.0000000 |
| line_id | end_weights | sum_weights | n_pixel |
|---|---|---|---|
| 1 | 0.1818593 | 104.5210 | 127 |
| 2 | 0.1825086 | 107.2911 | 132 |
| 3 | 0.1844797 | 107.2188 | 135 |
| 4 | 0.1678028 | 105.7782 | 135 |
| 5 | 0.1756642 | 109.1286 | 137 |
| 6 | 0.2133048 | 111.2900 | 142 |
| 7 | 0.1915245 | 109.1738 | 137 |
| 8 | 0.1610415 | 110.9404 | 138 |
| 9 | 0.1730885 | 110.1454 | 137 |
| 10 | 0.1926440 | 103.0547 | 128 |
4.10 Writing to filepath
Splitting SpatVector data imposes adjusting filepaths for writing multiple output SpatRasters. In scenarios where number of groups is greater than 1 unique filepaths based on set path are created.
if (length(data_iter) > 1 && !is.null(filepath)) {
# list output
# for iterating file names
file_ext = stringi::stri_extract(filepath, regex = "\\.(\\w+)$")
file_no_ext = substr(filepath, 1, nchar(filepath) - nchar(file_ext))
file_vect = rep(file_no_ext, length.out = length(data_iter))
file_vect = paste0(file_vect, "_", seq_along(file_vect), file_ext)
}
# for filepath = "raster.tif"
file_vect
## [1] "raster_1.tif" "raster_2.tif" "raster_3.tif" "raster_4.tif" "raster_5.tif" "raster_6.tif" "raster_7.tif" "raster_8.tif"Write every output SpatRaster to filepaths defined beforehand.
if (!is.null(filepath)) { # save raster
if (length(data_iter) > 1){ # update filepath
mapply(function(x, y) {
terra::writeRaster(x, filename = y)
message(paste0("Saving output to ", y))
}, x = output, y = file_vect)
} else {
terra::writeRaster(end_rast, filename = filepath)
message(paste0("Saving output to ", filepath))
}
}4.11 Output
Output of twsagps package functions is a single SpatRaster or a list of activity space or environmental exposure SpatRasters.
| count | area | min | max | range | mean | std | sum | |
|---|---|---|---|---|---|---|---|---|
| PO_ndvi | 1091 | 2729650 | -0.4050649 | 1.0608427 | 1.4659077 | 0.1067366 | 0.1088869 | 116.4496332 |
| KDE_ndvi | 19473 | 48720777 | -0.0000137 | 0.0002946 | 0.0003083 | 0.0000028 | 0.0000074 | 0.0549635 |
| DR_ndvi | 6858 | 17158503 | -0.0561597 | 0.1809338 | 0.2370935 | 0.0183426 | 0.0196866 | 125.7934349 |
| LS_ndvi | 20904 | 52301083 | -0.0810223 | 527.6416016 | 527.7226238 | 2.5270035 | 26.2159897 | 52824.4801867 |
| count | area | min | max | range | mean | std | sum | |
|---|---|---|---|---|---|---|---|---|
| PO_bike | 277 | 693047.3 | 1.0000000 | 84.0000000 | 83.0000000 | 7.2815884 | 9.4647429 | 2017.0000000 |
| KDE_bike | 2924 | 7315749.8 | 0.0000000 | 0.0139834 | 0.0139834 | 0.0002593 | 0.0008235 | 0.7581425 |
| DR_bike | 1391 | 3480242.4 | 0.0299472 | 9.0428655 | 9.0129184 | 1.2824664 | 1.3475486 | 1783.9107457 |
| LS_bike | 9803 | 24526747.7 | 0.1850707 | 29764.5585938 | 29764.3735231 | 299.7464519 | 2131.9429532 | 2938414.4679463 |
5 Raster statistics
Function rast_stats() calculates raster statistics utilizing terra::global(), terra::freq() and terra::expanse() functions. Statistics for any number of rasters are apposed in data frame.
Code
# Statistics calculation function
rast_stats = function(..., stats, row_names = NULL){
elipsis_list = list(...)
el_class = sapply(elipsis_list, class)
if (!all(el_class == "SpatRaster")){
if (any(el_class == "SpatRaster")){
warning("Skipping not SpatRaster class arguments")
elipsis_list = elipsis_list[el_class == "SpatRaster"]
} else {
stop("None of arguments is SpatRaster class")
}
}
if (any(sapply(elipsis_list, function(x) {terra::crs(x)}) == "") && "area" %in% stats) { # when
warning("Empty crs of one of arguments - area statistic will not be calculated")
stats = stats[!stats == "area"]
}
if (missing(stats) || length(stats) == 0){
stop("No statistic to be calculated")
}
df = data.frame(matrix(ncol = length(stats)))[-1,]
for (raster in elipsis_list){
row_df = data.frame(matrix(NA, ncol=1, nrow=1))[-1]
for (statistic in stats){
column = switch(
statistic,
range = {range = terra::global(raster, fun = "range", na.rm = TRUE)
val = range[2] - range[1]
colnames(val) = "range"
val},
count = {raster |> terra::freq() |> dplyr::summarise(count = sum(count))},
area = {raster |> terra::expanse() |> dplyr::select(area)},
terra::global(raster, fun = statistic, na.rm = TRUE)
)
row_df = cbind(row_df, column)
}
df = rbind(df, row_df)
}
if (is.null(row_names)){
el_names = names(elipsis_list)
names_repl = which(el_names == "")
el_names[names_repl] = names_repl
rownames(df) = el_names
} else if (length(row_names) == length(elipsis_list)) {
rownames(df) = row_names
} else {
warning("Incorrect length of row_names argument - default output row names")
}
return(df)
}Entirety of statistics tables are calculated using rast_stats() (Table 2, Table 3, Table 4, Table 8, Table 7).
6 Package issues
6.1 Extent and output
Splitting SpatVector GPS data introduces issues regarding SpatRaster extent and output format. Whilst considering extent with multiple SpatRaster, it may be adressed doubly. SpatRasters has various extents or all SpatRasters maintain identical extent. First, various extents are calculated by cropping grid raster to the minimal feasible extent, preserving grid raster cellsize and converting SpatVector data to cropped grid raster. It is solely overwritten by grid extent argument. Output of this approach must be a list of SpatRasters. Additionally, it enables avoiding disproportion between broad extents and data with clustered spatial distribution.
# calculate seperate extents
if (missing(grid_extent)) {
#new ext for each group
# get extent
group_extent = terra::ext(data_i)
# new extent - expanded extent by bandwidth
new_group_extent = c(terra::xmin(group_extent) - bandwidth,
terra::xmax(group_extent) + bandwidth,
terra::ymin(group_extent) - bandwidth,
terra::ymax(group_extent) + bandwidth)
# crop ext of each rast
grid_crop = terra::crop(grid_rast, new_group_extent)
kde_rast = spat_kde(data_i, grid_crop, bandwidth) # method specific calculation
} else {
kde_rast = spat_kde(data_i, grid_rast, bandwidth) # method specific calculation
}
# add to list
act_out[length(act_out) + 1] = as.list(kde_rast)Secondly, identical extent across every SpatRaster is accomplished by rasterizing all SpatVectors to grid raster without cropping. Uniform extents allows implementing alternative output type. Instead of output list of several items, output is generated as singular SpatRaster with multiple layers. Second approach proposes significantly more cohesive output, facilitating usage of data and allowing saving it to a single file.
# calculate same extent
kde_rast = spat_kde(data_i, grid_rast, bandwidth) # method specific calculation
# add SpatRaster as another layer
act_out = append(act_out, kde_rast)
# all items iterated
act_out
## class : SpatRaster
## dimensions : 1607, 736, 8 (nrow, ncol, nlyr)
## resolution : 50, 50 (x, y)
## extent : 459538, 496338, 3600575, 3680925 (xmin, xmax, ymin, ymax)
## coord. ref. : WGS 84 / UTM zone 11N (EPSG:32611)
## source(s) : memory
## names : last, last, last, last, last, last, ...
## min values : 1.081063e-14, 4.887112e-15, 6.871992e-11, 2.675202e-12, 9.057847e-11, 2.405132e-09, ...
## max values : 7.885379e-05, 1.000735e-03, 7.455308e-04, 1.523175e-03, 6.601711e-04, 7.408324e-05, ... 6.2 Line Segment normalization
Due to different workflow of Line Segment method applying activity raster normalization is problematic and cannot be accomplished solely by rescaling activity raster values. Thus two approaches of Line Segment are derived and implemented in time_only_norm argument.
First method of time normalization uses only time elapsed and utilizes BBmisc::normalize() on column of time data. Following computation of time elapsed, data frame last value (which is NA from diff_time() function) is set to 0 for proper normalization values.
# calculate time elapsed from all points
duration_line_id = data_proj |>
dplyr::mutate(time_elapsed = as.numeric(difftime(dplyr::lead(.data[[time_cname]]), .data[[time_cname]], units = time_unit)),line_id = dplyr::row_number()) |>
dplyr::select(line_id, time_elapsed)
# max time
max(duration_line_id$time_elapsed, na.rm = TRUE)
## [1] 2272.633
# min time
min(duration_line_id$time_elapsed, na.rm = TRUE)
## [1] 0.1666667
# last NA value set to 0 for normalization to not set lowest time value to 0
duration_line_id$time_elapsed[nrow(duration_line_id)] = 0
# first 50 normalized values
head(round(BBmisc::normalize(duration_line_id$time_elapsed, method = "range", margin = 1), 5), n =50)
## [1] 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007
## [12] 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007
## [23] 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007
## [34] 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007 0.00007
## [45] 0.00007 0.00007 0.00007 0.00007 0.00010 0.00012Second approach of normalization of Line Segment method is rasterization of SpatVector trajectory buffers. Maximal value from SpatRaster is retrieved and implemented in normalizing buffer SpatVector attribute. It enables normalizing Line Segment method without time data. It is only applicable to range normalization method.
duration_df = as.data.frame(duration_line_id)
trajectories = terra::vect(trajectories_fun(data_proj))
# create buffers over line segments
traj_buff = trajectories |>
tidyterra::select(line_id) |>
terra::buffer(bandwidth)
traj_buff$act = 1
traj_buff = dplyr::left_join(traj_buff, duration_df, by = 'line_id')
traj_buff = traj_buff[,-2]
names(traj_buff)[2] = "act"
norm_rast = terra::rasterize(traj_buff, grid_rast, field = "act", fun = "sum")
#read max val
max_norm = terra::minmax(norm_rast)[2]
# normalize
if (length(unique(traj_buff$act)) == 1){
#if vector is constant adjust BBmisc
traj_buff$act = BBmisc::normalize(traj_buff$act, method = "range",
margin = 1L, range = c(0, max(traj_buff$act) / max_norm * 2))
} else{
traj_buff$act = BBmisc::normalize(traj_buff$act, method = "range",
margin = 1L, range = c(0, max(traj_buff$act) / max_norm))
}
# max raster val
minmax(norm_rast)[2]
## [1] 7317.602
# min raster val
minmax(norm_rast)[1]
## [1] 0.1666667
# first 50 normalized values
BBmisc::normalize(traj_buff$act, method = "range", margin = 1L, range = c(0, max(traj_buff$act) / max_norm)) |> round(9) |> head(n=50)
## [1] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## [8] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## [15] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## [22] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## [29] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## [36] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000
## [43] 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000001
## [50] 0.000000002Latter normalization’s output resembles PO/KDE/DR normalization more than former approach though maximal value after converting SpatVector to SpatRaster is approximate to 1.
# min norm 1
minmax(norm_1)[1]
## [1] 0.00007333636
# max norm 1
minmax(norm_1)[2]
## [1] 3.219877
# min norm 2
minmax(norm_2)[1]
## [1] 0
# max norm 2
minmax(norm_2)[2]
## [1] 0.9964741
6.3 Name
Package requires a name more memorable than twsagps. Several new acronyms are proposed:
TWIGPS - Time-Weighted Integrated GPS (Time-Weighted spatial averaging methods Integrated with GPS) - personal favourite
HEXOSIS - Human (or Health) Environmental eXposure Over Space and time with Integrated GPS
HEXGPS - Human (or Health) Environmental eXposure with GPS
GPSHEAL - GPS-based Health and Environmental Assessment Locator
7 Further development
The BBmisc::normalize() and terra::writeRaster() functions possess great amount of not applied arguments that the user may consider utilizing. Incorporating those arguments as elipsis in exposure functions may be beneficial.
The spat_kde() and spat_dr() (along with kde() and kde_points()) functions features should be extented. Establishing those functions as standalone package function will enlarge and enrich twsagps application range.
Further work on twsagps package should focus on developing and enhancing documentation and error handling system.